B-SCT: Improve SpMV processing on SIMD architectures

نویسندگان

  • Yaohua Wang
  • Dong Wang
  • Xu Zhou
چکیده

Sparse matrix-vector multiplication (SpMV) represents the dominant cost in sparse linear algebra. However, sparse matrices exhibit inherent irregularity in both amount and distribution of none-zero values. This harnessed the tremendous potential of Single Instruction Multiple Data (SIMD) architectures, which is widely adopted in nowadays data-parallel processors. To improve the performance of SpMV, we proposed the Balanced SCT (B-SCT) method. The cornerstones are composed of the balanced-aware compression scheme and the on-the-fly data re-order structure. Our simulation results show that the B-SCT method provides an average speed-up of 130% over the commonly used CSR method, and 83% over the SIMD-oriented SCT method.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...

متن کامل

Yet another Hybrid Strategy for Auto-tuning SpMV on GPUs

Sparse matrix-vector multiplication (SpMV) is a key linear algebra algorithm and is widely used in many application domains. Besides multi-core architecture, there is also extensive research focusing on accelerating SpMV on many-core Graphics Processing Units (GPUs). SpMV computations have many indirect and irregular memory accesses, and load imbalance could occur while mapping computations ont...

متن کامل

Accurate cross-architecture performance modeling for sparse matrix-vector multiplication (SpMV) on GPUs

This paper presents an integrated analytical and profile-based cross-architecture performance modeling tool to specifically provide inter-architecture performance prediction for Sparse Matrix-Vector Multiplication (SpMV) on NVIDIA GPU architectures. To design and construct the tool, we investigate the interarchitecture relative performance for multiple SpMV kernels. For a sparse matrix, based o...

متن کامل

Applications of the streamed storage format for sparse matrix operations

The streamed storage format for sparse matrices showed good performance improvement for sparse matrix and vector multiply (SpMV) compared with compressed sparse row (CSR) and block CSR (BCSR) formats, particularly on IBM Power processors. We extend the format to exploit single instruction multiple data (SIMD) instructions in order to utilize the vector unit, and discuss how the streamed formats...

متن کامل

Sparse-matrix vector multiplication on hybrid CPU+GPU platform

Sparse-matrix vector multiplication(Spmv) is a basic operation in many linear algebra kernels.So it is interesting to have a spmv on modern architectures like GPU. As it is a irregular computation CPU also performs compares to GPU. So it is interesting to have this routine in hybrid architectures like CPU+GPU.So we have designed a hybrid algorithm for Spmv which uses a CPU and a GPU. We have ex...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • IEICE Electronic Express

دوره 12  شماره 

صفحات  -

تاریخ انتشار 2015